
Using Synchronous Writing

When you open a disk file and do not specify the O_SYNC flag, a call to write() for that file returns as soon as the data has been copied to a buffer managed by the device driver (see the open(2) reference page).

The actual disk write may not take place until considerable time has passed. A common pool of disk buffers is used for all disk files. Disk buffering is integrated with the virtual memory paging mechanism. A daemon executes periodically and initiates output of buffered blocks according to the age of the data and the needs of the system.

Tip: The number of disk blocks that are written in each output operation is set by the dwcluster tuning variable. The system administrator can adjust this value with systune (see the systune(1) reference page).

The default management of disk output improves performance in general but has two drawbacks:

You can force the writing of all pending output for a file by calling fsync() (see the fsync(2) reference page). This gives you a way of creating a known checkpoint of a file. However, fsync() blocks until all buffered writes are complete, which can take a long time.
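The checkpoint technique can be sketched as follows. This is a minimal illustration, not a definitive implementation: write_checkpoint() is a hypothetical helper name, and the open flags and file mode are assumptions chosen for the example.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: append one record to a file, then force it to disk.
 * Returns 0 on success, -1 on any error. */
int write_checkpoint(const char *path, const char *record)
{
    /* No O_SYNC here, so write() returns as soon as the data is buffered. */
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(record);
    ssize_t n = write(fd, record, len);
    if (n < 0 || (size_t)n != len) {
        close(fd);
        return -1;
    }

    /* fsync() blocks until all pending output for fd has been written.
     * After it returns successfully, the file is at a known checkpoint. */
    if (fsync(fd) < 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

A caller would typically invoke the helper only at points where a consistent on-disk state matters, because each call pays the full cost of the flush.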

When you open a disk file specifying O_SYNC, each call to write() blocks until the data has been written to disk. This gives you a way of ensuring that all output is complete as it is created. If you combine O_SYNC access with asynchronous I/O, you can let the asynchronous process suffer the delay.
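A minimal sketch of synchronous output follows. The function name write_sync() is hypothetical, and the flags other than O_SYNC are assumptions made for the example. The asynchronous I/O combination mentioned above is not shown.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create or truncate a file and write one buffer
 * to it synchronously. Returns 0 on success, -1 on error. */
int write_sync(const char *path, const void *buf, size_t len)
{
    /* O_SYNC makes every write() on this descriptor block until the
     * data has been written to disk. */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, buf, len);
    int rc = (n >= 0 && (size_t)n == len) ? 0 : -1;

    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

No separate fsync() call is needed here, because the delay is paid inside each write() instead.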

The O_SYNC option requires completed output even when the amount of data written is less than the physical block size of the disk, or when the output data does not align with the physical boundaries of disk blocks. This can lead to writing and rewriting the same disk blocks, wasting time. Output to a file opened with O_SYNC is also copied to kernel memory before it is written to disk.
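One way to reduce the rewriting of partial blocks is to accumulate small records in memory and issue only whole-block writes. The following is a sketch under stated assumptions: BLOCK_SIZE is an assumed value (it should be a multiple of the real physical block size), the bw_ names are hypothetical, and the file is written sequentially from offset zero so that each full write begins on a block boundary.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 4096  /* assumed; use a multiple of the real block size */

struct block_writer {
    int fd;
    size_t used;              /* bytes currently held in buf */
    char buf[BLOCK_SIZE];
};

/* Open the output file with O_SYNC. Returns 0 on success, -1 on error. */
int bw_open(struct block_writer *bw, const char *path)
{
    bw->used = 0;
    bw->fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);
    return bw->fd < 0 ? -1 : 0;
}

/* Write out whatever is buffered, retrying after short writes. */
static int bw_flush(struct block_writer *bw)
{
    size_t off = 0;
    while (off < bw->used) {
        ssize_t n = write(bw->fd, bw->buf + off, bw->used - off);
        if (n <= 0)
            return -1;
        off += (size_t)n;
    }
    bw->used = 0;
    return 0;
}

/* Copy a record into the buffer; write only when a full block is ready. */
int bw_append(struct block_writer *bw, const void *data, size_t len)
{
    const char *p = data;
    while (len > 0) {
        size_t room = BLOCK_SIZE - bw->used;
        size_t take = len < room ? len : room;
        memcpy(bw->buf + bw->used, p, take);
        bw->used += take;
        p += take;
        len -= take;
        if (bw->used == BLOCK_SIZE && bw_flush(bw) < 0)
            return -1;
    }
    return 0;
}

/* Write the final partial block, if any, and close the file. */
int bw_close(struct block_writer *bw)
{
    int rc = bw_flush(bw);
    if (close(bw->fd) < 0)
        rc = -1;
    return rc;
}
```

The trade-off is that records held in the buffer are not yet on disk, so this approach gives up the per-record guarantee of O_SYNC in exchange for fewer, larger synchronous writes.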

